Design and construction of the Tracking Written Learner Language (TRAWL) Corpus: A longitudinal and multilingual young learner corpus
Abstract
This article describes the design and construction of the Tracking Written Learner Language (TRAWL) Corpus. The corpus combines several features that are all rare for learner corpora: it is longitudinal, following individual pupils over several years; it has data from young learners in school years 5 to 13 (ages 10–18); it is multilingual, containing learners' texts in the L3s (French, German, Spanish), L2 English, and L1 Norwegian; and it includes teacher comments on a number of texts. In addition, some texts exist in both a first and a second, revised version, and the texts are tied to a rich set of meta-data. Not only does such a corpus offer new possibilities for research on language acquisition in general; it can also be used to provide valuable insights for teachers, teacher training, and policymaking within the national context of Norway. In this article, we describe the TRAWL Corpus and outline its uses and benefits for the research community. We describe the compilation process in the hope that it may inspire and enable others to build similar corpora in their own contexts.
Similar resources
Metadiscourse Markers in a Corpus of Learner Language: The Case of Iranian EFL Learners
Different issues have been probed in learner corpus research since the late 1980s. However, given the importance of metadiscourse markers (MDMs) in signposting academic discourse, their use in Iranian EFL learners' academic essays is an area of research in need of a more serious analysis. Contributing to this line of investigation, this paper reports a corpus-based study of the use of MDMs i...
Building a learner corpus
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked levels to cope with a wide range of error types present in the input. Each level corrects different types of errors; links between the levels allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, bu...
The MERLIN corpus: Learner language and the CEFR
The MERLIN corpus is a written learner corpus for Czech, German, and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR) with authentic learner data. The corpus contains 2,290 learner texts produced in standardized language certifications covering CEFR levels A1–C1. The MERLIN annotation scheme includes a wide range of language characteri...
The ASK Corpus - a Language Learner Corpus of Norwegian as a Second Language
In our paper we present the design and interface of ASK, a language learner corpus of Norwegian as a second language which contains essays collected from language tests on two different proficiency levels as well as personal data from the test takers. In addition, the corpus also contains texts and relevant personal data from native Norwegians as control data. The texts as well as the personal ...
The Jinan Chinese Learner Corpus
We present the Jinan Chinese Learner Corpus, a large collection of L2 Chinese texts produced by learners that can be used for educational tasks. The present work introduces the data and provides a detailed description. Currently, the corpus contains approximately 6 million Chinese characters written by students from over 50 different L1 backgrounds. This is a large-scale corpus of learner Chine...
Journal
Journal title: Nordic Journal of Language Teaching and Learning (formerly NJMLM)
Year: 2023
ISSN: 2703-8629
DOI: https://doi.org/10.46364/njltl.v10i2.1005